Form identification and processing system.

A method and an apparatus for identifying
completed forms includes scanning a plurality
of different blank forms, and creating hierarchical
profiles of each scanned blank form. Each
hierarchical form profile is stored in a dictionary.
Once the form dictionary is created, a
completed form is scanned. A hierarchical profile
of the completed form is created, and the
hierarchical profile of the completed form is
compared with stored hierarchical form profiles.
In accordance with the result of comparison,
one of the stored hierarchical form profiles
is identified as corresponding to the completed
form hierarchical profile. Based on the identity
of the corresponding hierarchical form profile,
the completed form can be routed for further
processing. A further aspect of the invention
makes it possible to extract data from predesignated
fields which may be unique to that particular
form within a completed form based on
the form's identity. Furthermore, by using the
form dictionary, it is possible to identify a completed
form, extract data from the completed
form, store the data with the form's identity, and
display the completed form by drawing the
identified form using the vectorized data from
the form dictionary and superimposing the extracted
data from the completed form into respective
data fields.
The present invention relates to retrieval and to
processing of data from completed printed forms. In 
particular, the present invention relates to a method 
and apparatus which uses feature extraction techni- 
ques to identify features in the completed forms and 
which, based on the features identified in the form, lo- 
cates desired data, extracts the desired data, stores 
the desired data in memory and, if necessary, dis- 
plays the extracted data to an operator for post- 
processing. 

Data entry and retrieval systems, such as a 
Document Image Management System, are widely 
used to enter data from completed printed document 
forms. Generally, such systems are designed to proc- 
ess data from various types of completed forms, such 
as credit forms, insurance forms, survey forms, hos- 
pital forms, etc. In order to process many different 
types of forms, system operators manually sort 
through the forms and separate the forms into batch- 
es of similar forms. Once the various types of forms 
have been sorted and separated into similar batches, 
a batch of the same type of completed form is distributed
to a data entry operator for manual input of data
from the completed form into a central processing
system. During the data entry process, the operator 
reviews a completed form, determines the data which 
is to be manually keyed-in, and manually keys-in data 
from each completed form into the central processing 
system. 

In order to save time, a more sophisticated manner
of paperless sorting has sometimes been adopted.
Using paperless sorting, a completed form is digitally
scanned and a digital image of the form is stored.
Once the digital image of the completed form is stored
in memory, it can be identified by either data entry
operators or by an automatic sorting process which
first locates a coded "indicia field" such as a barcode
field on the printed form and subjects the indicia field
to suitable processing. Once the completed form is 
recognized, the form is automatically sorted and rout- 
ed to the appropriate data entry operator workstation 
for further processing as described above. 

Despite the advantages of paperless sorting, 
both the manual system and the paperless sorting 
system suffer from disadvantages in that both sys- 
tems consume a great amount of time to sort and to 
manually enter data from the completed form. In ad- 
dition, companies which utilize automatic sorting 
practices by scanning all completed forms utilize 
large amounts of mass storage. These companies are 
therefore limited by the amount of mass storage avail- 
able for storing images of completed forms. Even 
though storing an entire image of a completed form 
cuts down on the number of man-hours used to man- 
ually sort through completed forms, the amount of 
mass storage increases exponentially. In this regard, 
since only a small portion of a completed form contains
desired data, a large portion of memory is wasted
by storing redundant elements in each completed
form, such as the blank printed form itself, and captions
like name, date, address, etc. Image data which
includes unused portions, or "null fields", and "white
space" wastes additional mass storage as well. Thus,
a large portion of mass storage is utilized for useless 
data and/or non-data storage. 

Heretofore, it has not been possible to automatically
input various types of completed printed forms,
extract the desired data from the completed forms
and, if desired, display only the completed data to an
operator. That is, conventionally, once a completed
document has been sorted and stored, an entire
document must be displayed to an operator so that
the operator can enter the desired data from the completed
form. Consequently, data entry and retrieval
are both time-consuming and costly.

It is an object of the present invention to address 
one or more of the foregoing difficulties. 

In one aspect of the present invention, a method
for recognizing completed forms includes the steps of
scanning a plurality of different types of blank printed
forms, using feature extraction techniques to create
a hierarchical profile for each scanned blank form,
enhancing and modifying each blank form profile by
eliminating similarities between each blank form profile,
and storing the enhanced blank form profiles in
a form dictionary. In the method for recognizing completed
forms, a completed form is scanned and a hierarchical
profile for the completed form is created
using the same feature extraction techniques as
those used to create the form dictionary. The completed
form profile is compared to each of the blank
form profiles in the form dictionary until such time
that the hierarchical profile of the completed form is
identified as one of the blank forms from the blank
form dictionary.

In a related aspect of the invention, there is a
method for displaying data from a completed form using
the same feature extraction techniques as those
used to create the form dictionary. A completed
printed form is scanned
and a hierarchical profile for the completed form is
created. The completed form profile is compared with
each of plural blank form profiles in a form dictionary
until the completed form is identified as one of the
blank forms in the dictionary. Upon identifying the
completed form, the completed form profile and the
blank form profile are compared and, based on the
comparison, dissimilar image data from the completed
form profile is extracted and stored for display to
an operator. Because only dissimilar data is displayed
to an operator, the operator can more easily recognize
what data is to be keyed-in. If desired, field identifiers
may also be displayed.

In yet another aspect of the invention, a method
for storing completed portions of a form includes the
steps of scanning a completed printed form, creating a
hierarchical profile of the completed form using the
same feature extraction techniques as those used to
create the form dictionary, comparing the completed
form profile with stored blank form profiles, identifying
a stored blank form profile as that of the completed
form profile based on the results of comparison,
comparing the blank form profile with the completed
form profile and extracting dissimilar data from the
completed form profile, storing the extracted dissimilar
data, and reassembling the completed form by
displaying the identified blank form and overlaying
the dissimilar data at the appropriate locations within
the form.

This brief summary of the invention is provided so 
that the nature of the invention may be understood 
quickly. A fuller understanding may be obtained by
reference to the following detailed description of the 
invention in connection with the appended drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a representational view of a network
system for capturing images from completed
forms;

Figure 2 is a block diagram of the data entry and
retrieval system of the present invention;

Figure 3 is a functional block diagram of a method
for producing a blank form dictionary in the present
invention;

Figure 4a is an example of a blank form used with
the present invention;

Figure 4b is a topographical view of a blank form
template used in feature extraction;

Figure 4c is an illustration of a hierarchical form
profile of vectorized data;

Figure 5 is a flow chart describing the method for
creating a blank form dictionary;

Figure 6 is an example of a completed form used
with the present invention;

Figure 7 is a general block diagram of the process
for identifying and for routing completed forms;

Figure 8a is a topographical view of a completed
form template of the completed form shown in
Figure 6;

Figure 8b, comprising 8b-1 and 8b-2, is an illustration
of a computer-usable format of a vectorized
version of the completed form shown in Figure 6;

Figure 9 is a flow chart describing the method for
retrieving desired data from the completed form
shown in Figure 6;

Figure 10a is a functional block diagram of a
method for extracting desired data and field
header information from completed forms;

Figure 10b illustrates an example of extracting
and displaying desired data from a completed
form;

Figure 11 is a general block diagram of the process
for extracting and for displaying desired data
from a completed form;

Figure 12 illustrates a flow chart describing the
method for extracting data from completed forms
and displaying the extracted data with field header
information to an operator in a second embodiment
of the present invention;

Figure 13 illustrates an example of recreating a
completed form from extracted data and a stored
corresponding blank form;

Figure 14 is a general block diagram of the process
for extracting data from a completed form,
storing an indicia of a corresponding blank form,
and recreating the completed form; and

Figure 15 illustrates a flow chart describing the
operation for extracting data from a completed
form, storing the extracted data and the identity
of a blank form from the form dictionary, and recreating
the completed document by displaying
the identified blank form superimposed with the
extracted data in appropriate field locations of the
form in a third embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

Figure 1 illustrates a network system for extract- 
ing and for storing desired data from completed 
forms. 

As shown in Figure 1, reference numeral 1 designates
a document image management system.
Document image management system 1 includes 
document scanner 2 for scanning printed forms, in- 
cluding blank printed forms and completed printed 
forms. Document scanner 2 creates digital image 
data from scanned forms and outputs the image data 
to workstation 3. 

Workstation 3 includes computing equipment 
such as an IBM PC or PC-compatible computer 4. 
Workstation 3 further includes a local area network 
interface 6 which provides interface to the local area 
network, whereby workstation 3 can access image 
data files stored thereon. Workstation 3 can either 
store input image data created by document scanner 
2 or output the image data to a file server (not
shown) located on the local area network. Worksta- 
tion 3 also includes keyboard 8 and mouse 9 for al- 
lowing user designation of areas on display screen 5. 

As shown in Figure 2, PC 4 includes CPU 10, such
as an 80386 processor, which executes stored program
instructions such as operator selected application
programs that are stored on hard drive 10a. The
document image data created by document scanner
2 is received by PC 4 and, prior to processing, PC 4
temporarily stores the image data in a temporary storage
area, such as random access memory 10b
(RAM). Upon storing the image data in RAM 10b,
CPU 10 executes a feature extraction program stored
in hard drive 10a. CPU 10 processes the image data
in accordance with the feature extraction program.
The processed data is compared with forms stored in
form dictionary 20 (to be discussed below). Thereafter,
the result of comparison is displayed on display
screen 5.

Prior to utilizing the data retrieving process from
workstation 3, workstation 3 must be initially equipped
with a form dictionary. That is, workstation 3 must
create a form dictionary of all forms used with the system.
This process of creating a form dictionary includes
scanning each type of blank form used in the
data retrieving system and storing a hierarchical form
profile of vectorized data in the form dictionary for
each form.

Figures 3-5 discuss in greater detail the manner 
by which form dictionary 20 is created.

Figure 3 is a functional block diagram illustrating
the method for producing form dictionary 20. As
shown in Figure 3, a plurality of blank form types (A,
B, C, and D) are scanned by scanner 2 in order to input
image data for processing. Since the method of
producing a form dictionary is the same for each
blank form type, the following description of the method
for constructing form dictionary 20 will be discussed
only with respect to blank form 11 shown in
Figure 4a for the purposes of brevity.

In Figure 4a, blank form 11 is input by document
scanner 2. Document scanner 2 produces image data
which is output to workstation 3. Upon receiving the
image data of blank form 11, the image data is either
temporarily stored to a file server on the local area
network or stored in RAM 10b of PC 4. Once the entire
image data of blank form 11 is received, PC 4
processes the image data in accordance with a feature
extraction program, such as the feature extraction
technique disclosed in commonly assigned U.S.
Patent Application Serial No. 07/873,012, the contents
of which are incorporated herein by reference,
or any other suitable type of feature extraction process
which creates a hierarchical profile. The stored
feature extraction program is retrieved from hard
drive 10a and stored in RAM 10b. Once the program
is stored, CPU 10 executes the process steps from
RAM 10b.

Upon initiation of the program, the feature extraction
program designates data blocks within form 11 in
order to form a blank form template of blank form 11,
as shown in Figure 4b. Each data block on the blank
form template is designated by an index number, (x,
y) coordinates, a length measurement and a width
measurement of the block. For example, the heading
"Canon Information Systems: INVOICE" in Figure 4a
is designated in Figure 4b as block 1.1 for "Canon Information
Systems" and as block 1.2 for "INVOICE".

Once the template has been created, a computer-usable
format is generated from the template by the
feature extraction program to create a hierarchical
form profile of vectorized data. As shown in Figure 4c,
the computer-usable format 11a of blank form 11 is
created by identifying the block location with (x, y) coordinates,
a length measurement, a width measurement,
and an attribute of the block. The attribute information
relates to the type of information in the
block, such as text, graphics, etc.

In some instances, a block, such as block 1.11,
which includes both routing information and identification
information, is divided into sub-blocks. For example,
as shown in Figure 4b, block 1.11 is divided
into several sub-blocks, such as block 1.11.3.1 which
designates a customer reference number. Block
1.11.3.1 is defined in Figure 4c as x = 45, y = 198, l
= 13, w = 435, and att = 1.
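
The block record described above can be sketched as a small data
structure. The following Python sketch is illustrative only: the class
name Block and the helpers parent_index and depth are hypothetical, and
the meaning of attribute value 1 is not stated in the text, so the
attribute is kept as an opaque code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One entry of a hierarchical form profile (names are assumptions)."""

    index: str      # hierarchical index number, e.g. "1.11.3.1"
    x: int          # x coordinate of the block
    y: int          # y coordinate of the block
    length: int     # length measurement of the block
    width: int      # width measurement of the block
    attribute: int  # type code (text, graphics, etc.); mapping is assumed

    def parent_index(self):
        """Return the index of the enclosing block, or None at the top."""
        head, _, _ = self.index.rpartition(".")
        return head or None

    def depth(self):
        """Return the nesting level implied by the index number."""
        return self.index.count(".") + 1


# The customer reference number block, using the values given for Figure 4c.
customer_ref = Block("1.11.3.1", x=45, y=198, length=13, width=435, attribute=1)
```

Encoding the hierarchy in the index string keeps each record flat while
still allowing a sub-block to be traced back to its enclosing block.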

Reverting to Figure 4a, once hierarchical form profile
11a has been created and stored as vectorized data,
several exemplary completed forms of blank form 11
are scanned and input into the system. Each exemplary
completed form 12, 13, and 14 is processed by
the feature extraction program in order to create a
completed form profile of vectorized data for each
completed form. Upon creating completed form profiles
12a, 13a, and 14a, completed form profiles 12a,
13a, and 14a are compared and invariant elements
are extracted. The invariant elements are summarized
to create completed form profile 11b. Blank form
profile 11a of blank form 11 and completed form profile
11b are correlated to produce a blank form profile
11c. Blank form profile 11c is stored as vectorized
data in form dictionary 20. In the case that more than
one blank form is used in the system, the above process
is repeated for each type of blank form used in the
system.
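
The extraction of invariant elements can be illustrated with set
operations. This is a minimal sketch under stated assumptions: blocks
are compared by exact equality of (x, y, length, width, attribute)
tuples, and "correlation" is modeled as a plain intersection. The text
does not specify either choice, and a real scanner would need some
positional tolerance. The function names and all block values are
invented for illustration.

```python
def invariant_elements(profiles):
    """Return the blocks present in every exemplary completed form profile."""
    if not profiles:
        return set()
    common = set(profiles[0])
    for profile in profiles[1:]:
        common &= set(profile)
    return common


def correlate(blank_profile, exemplary_profile):
    """Keep the blank form blocks that also survive in completed forms."""
    return set(blank_profile) & set(exemplary_profile)


# Blocks are (x, y, length, width, attribute) tuples; values are made up.
heading = (10, 10, 20, 300, 1)
table = (10, 100, 200, 500, 2)
profile_11a = {heading, table, (400, 10, 20, 80, 1)}
profile_12a = {heading, table, (50, 120, 12, 90, 1)}
profile_13a = {heading, table, (50, 120, 12, 70, 1)}
profile_14a = {heading, table}

profile_11b = invariant_elements([profile_12a, profile_13a, profile_14a])
profile_11c = correlate(profile_11a, profile_11b)
```

In this toy data the filled-in blocks differ from form to form, so only
the heading and the table survive into profiles 11b and 11c.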

In the case that the system utilizes plural blank 
forms, the resulting form profiles (corresponding to 
form profile 11c) are post-processed to eliminate any 
ambiguities between form profiles. Once the ambigu- 
ities are eliminated, each blank form profile is stored 
in form dictionary 20. Upon completing blank form 
dictionary 20, the data entry system is prepared to 
identify and to store data from completed forms.

The flow chart in Figure 5 illustrates in greater detail
the process of constructing form dictionary 20. In
step S501, a blank form is scanned and image data
is created and sent to workstation 3. The image data
is processed by PC 4 using the feature extraction program
stored on the hard drive 10a. The feature extraction
program creates a template of blank form 11
from which computer-usable vectorized data for a
blank form profile can be created. As previously discussed
with respect to Figure 4c, the hierarchical
form profile includes vectorized data which defines
the block by an index number, (x, y) coordinates, a
length measurement, a width measurement, and attribute
information which defines the type of block,
i.e., text, table, graphics, etc.

After creating the hierarchical form profile in step
S502, a plurality of exemplary completed forms 12,
13, and 14 of blank form 11 are scanned by scanner
2 in step S504. In step S505, the feature extraction
program operates to create hierarchical form profiles
of vectorized data for each completed form. In step
S506, each completed form profile is compared and
invariant elements in each completed form profile are
extracted. The extracted invariant elements are correlated
to create an exemplary completed form profile
11b.

Blank form profile 11a and exemplary completed
form profile 11b are correlated to form blank form profile
11c in step S508. In the case that more than one
blank form is used in a data system, flow proceeds to
step S509 in which steps S501-S508 are repeated for
each blank form used in the system.

In step S510, each different form profile is post-processed
by removing any ambiguities between all
form profiles. The resulting disambiguated hierarchical
form profile is stored in form dictionary 20 in step
S511. Once form dictionary 20 is complete, form dictionary
20 contains all the information required to distinguish
between forms. If two different forms have
the same hierarchical structure, each form is subjected
to OCR processing in order to identify a feature of
the form which in most cases will be different from
any other form, such as a form number located at the
bottom of the form. This process is usually a post-processing
step which is operator assisted in order to
eliminate similar features in different forms. Upon
completing the foregoing steps, form dictionary 20 is
complete.
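
One way to find the forms that need this extra OCR step is to group
dictionary entries by their block structure. The sketch below is an
assumption about how such a check might look, not the disclosed
post-processing: find_ambiguous is a hypothetical helper, profiles are
modeled as sets of (x, y, length, width, attribute) tuples, and "same
hierarchical structure" is taken to mean identical sets.

```python
from collections import defaultdict


def find_ambiguous(dictionary):
    """Group form names whose profiles have identical block structure."""
    groups = defaultdict(list)
    for name, profile in dictionary.items():
        groups[frozenset(profile)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Invented profiles: forms A and B share a layout, form C does not.
forms = {
    "A": {(0, 0, 10, 100, 1), (0, 20, 50, 100, 2)},
    "B": {(0, 0, 10, 100, 1), (0, 20, 50, 100, 2)},
    "C": {(0, 0, 10, 100, 1)},
}
ambiguous = find_ambiguous(forms)
```

Each group returned here would be a candidate for the operator-assisted
step, for example by reading a form number from each member with OCR.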

Once form dictionary 20 has been created, completed
forms can be identified and desired data extracted
from completed forms. For example, Figure 6
illustrates an example of completed form 40. As
shown, only a small portion of form 40 includes desired
data. That is, it is desirable, and more efficient,
to extract only information regarding certain items,
such as invoice number, dates, description of items,
quantity, and prices. The remaining information on
the form is deemed to be useless data and, therefore,
it is not extracted.

In order to obtain and to limit the amount of data
extracted, completed form 40 is subjected to an identification
process. In accordance with the identity of
the completed document, a data extraction process
extracts preselected data fields from the completed
form.

Figure 7 is a general block diagram illustrating
the process for identifying, extracting and routing the
identified form. As shown in Figure 7, image data of
completed form 40 is processed to create a hierarchical
profile of vectorized data by hierarchical profile
creator 41 using the same feature extraction method
for creating form dictionary 20. The created hierarchical
profile is compared with hierarchical profiles of
forms stored in form dictionary 20 by hierarchical profile
comparator 42. Hierarchical profile comparator 42
identifies a corresponding form in form dictionary 20
and, in accordance with the identity of that form, image
data of completed form 40 is routed to an appropriate
processing station by form router 45.

In more detail, to extract desired data, the feature
extraction program forms a completed form template
of completed form 40 as shown in Figure 8a. The template
facilitates the process for creating the vectorized
data which defines the completed form in Figures
8b-1 and 8b-2. Once the completed form is identified,
PC 4 determines, in accordance with the identity
of the completed form, where to route the completed
form for further processing.

The method for identifying a completed form and 
extracting desired data will be discussed in greater 
detail with respect to the flow chart illustrated in Fig- 
ure 9. 

In step S901, completed form 40 is scanned and 
image data of completed form 40 is output to worksta- 
tion 3 for processing. PC 4 in workstation 3 processes 
the image data of completed form 40 in accordance 
with a stored feature extraction technique. The fea- 
ture extraction program operates to divide completed 
form 40 into blocks and creates a hierarchical form 
profile in step S902. The completed form profile in- 
cludes vectorized data defining the block layout of 
completed form 40. The completed form profile 
shown in Figures 8b-1 and 8b-2 is compared with 
blank form profiles in form dictionary 20 in step S903.
Invariant elements between the completed form pro- 
file and the form profiles in form dictionary 20 are 
compared in step S903. 

In step S904, CPU 10 determines whether a predetermined
level of invariant elements has been identified
between the completed form profile and at least
one blank form profile in form dictionary 20. In the 
case that the number of invariant elements does not 
reach the predetermined level, in step S905 the data 
entry operator is alerted that the form has not been 
identified and the completed form is rejected for man- 
ual identification by the operator or the unidentified 
form can be subjected to OCR. 

In step S906, the form is identified by either the 
operator or by CPU 10. That is, CPU 10 selects the 
blank form profile having the greatest number of invariant
elements in common with the completed form
profile. This blank form profile is determined to be the 
corresponding form to the completed form. 
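
Steps S903 to S906 amount to counting shared elements and applying a
threshold. The sketch below shows that logic under simplifying
assumptions: profiles are sets of (x, y, length, width, attribute)
tuples matched by exact equality, and the function name identify, the
form names, and the threshold value are all hypothetical.

```python
def identify(completed, dictionary, threshold):
    """Return the best-matching form name, or None if below the threshold."""
    best_name, best_count = None, 0
    for name, blank in dictionary.items():
        # Count the elements the completed form shares with this blank form.
        count = len(set(completed) & set(blank))
        if count > best_count:
            best_name, best_count = name, count
    return best_name if best_count >= threshold else None


# Invented profiles for two blank forms and one completed form.
dictionary = {
    "invoice": {(10, 10, 20, 300, 1), (10, 100, 200, 500, 2), (10, 700, 12, 90, 1)},
    "claim": {(10, 10, 20, 300, 1), (300, 50, 40, 200, 1)},
}
completed = {(10, 10, 20, 300, 1), (10, 100, 200, 500, 2), (60, 120, 12, 80, 1)}
match = identify(completed, dictionary, threshold=2)
```

A result of None corresponds to the rejection path of step S905, where
the form is handed to the operator or subjected to OCR.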

Upon identifying the form, the completed form is 
routed in accordance with its particular identity in
step S907. For example, a completed form may be 
routed to the personnel department if the form is an 
employment application; to the claim department if 
the form is an insurance claim form, etc. On the other 
hand, the completed form may be routed for further 
processing, such as extracting particular data fields 
from certain portions of the form and subjecting those 
extracted fields to optical character recognition.

In addition to routing particulars, completed 
documents may be submitted for further processing 
since some completed documents contain several 
types of data. For example, a completed document 
may be a hybrid document which contains image,
text, graphs, etc., and, therefore, the text portion is
subjected to a different recognition process than an
image. That is, text is generally subjected to optical
character recognition while the image is generally
subjected to half-tone processing.
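
The two routing decisions described above, by form identity and by
block type, can be sketched as simple lookups. Everything named here is
an assumption made for illustration: the form identifiers, the
attribute labels "text" and "image", and the fallback strings do not
come from the disclosure.

```python
# Destination per identified form, following the examples in the text.
FORM_ROUTES = {
    "employment_application": "personnel department",
    "insurance_claim": "claim department",
}

# Recognition process per block attribute (labels are assumed).
BLOCK_PROCESSES = {
    "text": "optical character recognition",
    "image": "half-tone processing",
}


def route_form(form_id):
    """Return the destination for an identified form."""
    return FORM_ROUTES.get(form_id, "manual identification")


def plan_block_processing(blocks):
    """Pair each (index, attribute) block with a recognition process."""
    return [
        (index, BLOCK_PROCESSES.get(attribute, "no further processing"))
        for index, attribute in blocks
    ]


plan = plan_block_processing([("1.1", "text"), ("1.2", "image"), ("1.3", "graph")])
```

Keeping both mappings as data rather than code means a new form type or
block type only requires a new table entry.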

A second embodiment of the present invention is
to be described with reference to Figures 10a and 
10b. The invention to be described hereinbelow util- 
izes form dictionary 20 which has been previously de- 
scribed above and, therefore, details for creating form 
dictionary 20 will be omitted for the purpose of brev- 
ity. 

Referring to Figure 10a, there is illustrated a 
functional block diagram of a method for extracting 
predesignated "data fields" and for displaying only 
those data fields to an operator. As shown in Figure 
10a, a blank form is scanned and a hierarchical profile
of the form is created by feature extraction. 

An operator, during a preproduction phase of cre- 
ating a hierarchical profile of a blank form, can des- 
ignate certain data fields to be extracted from among 
other data fields in a completed form. In the preproduction
phase, data fields within a blank form are designated
by an operator for extracting data therein. The
designation of each data field is stored with the blank 
form profile in form dictionary 20. Once the prepro- 
duction of the hierarchical form profile has been com- 
pleted, completed forms are processed. 

Upon scanning a completed form, a hierarchical 
profile of the completed form is created by feature
extraction techniques. The completed profile is compared
with all hierarchical form profiles in form dictionary
20. Assuming that the completed form is identified
as one of the blank forms in the form dictionary,
a blank form profile and the completed form profile 
are compared and dissimilar elements in the complet- 
ed form profile are extracted. 

In accordance with predesignated information, 
the dissimilar elements and the predesignated fields 
are compared once again. The dissimilar elements 
falling within the preselected fields are stored and the 
remaining dissimilar elements are discarded. The 
data within the preselected fields are either subjected 
to further processing or displayed to the data entry 
operator for manual input.
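
The two comparisons described above can be sketched as a set difference
followed by a containment test. This is a minimal illustration under
stated assumptions: rectangles are (x, y, length, width) tuples with
length taken as the vertical extent and width as the horizontal extent,
blocks match only when identical, and the function and field names are
hypothetical.

```python
def dissimilar_elements(completed, blank):
    """Blocks in the completed form profile that the blank profile lacks."""
    blank_set = set(blank)
    return [block for block in completed if block not in blank_set]


def inside(block, field):
    """True if the block rectangle lies within the field rectangle."""
    bx, by, bl, bw = block
    fx, fy, fl, fw = field
    return fx <= bx and fy <= by and bx + bw <= fx + fw and by + bl <= fy + fl


def select_predesignated(dissimilar, fields):
    """Keep dissimilar blocks inside a named field; discard the rest."""
    kept = {}
    for name, field in fields.items():
        hits = [block for block in dissimilar if inside(block, field)]
        if hits:
            kept[name] = hits
    return kept


# Rectangles are (x, y, length, width); all values are invented.
blank = [(10, 10, 20, 300)]
completed = [(10, 10, 20, 300), (60, 105, 12, 80), (400, 500, 12, 50)]
fields = {"invoice_number": (50, 100, 20, 200)}
extracted = select_predesignated(dissimilar_elements(completed, blank), fields)
```

In this toy data the heading is dropped because it also appears on the
blank form, and the stray block at (400, 500) is discarded because it
falls outside every predesignated field.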

For example, in Figure 10b, there is illustrated a
completed form which includes, among other items,
invoice number, origin and destination information,
quantity information, and description information. Because
these data fields were predesignated during
the preproduction definition stage, data within each
field is extracted from the completed form. Accordingly,
in order to remove all extraneous items from the
form, the form is input into the data entry and retrieval
system and, in accordance with the present invention,
only the desired information is displayed to the data
entry operator for manual-key input. As a result, a
data entry operator can reduce the amount of time in
reviewing a completed form for useful data.

Thus, as shown with respect to Figure 11, image
data of completed form 40 is processed to create a hierarchical
profile by hierarchical profile creator 50.
The profile is compared and identified as a hierarchical
form profile in form dictionary 20 by hierarchical
profile comparator 51. Hierarchical profile comparator
51 compares the identified form from dictionary
20 with the hierarchical profile of completed form 40.
Dissimilar data with respective field header information
is extracted from the completed form by dissimilar
data and field header extractor 54. The extracted
data either can be stored or can be displayed immediately
to an operator.

Figure 12 discusses in greater detail the process
of extracting desired fields from completed forms. In
step S1201, a completed document is scanned and
the image data is subjected to feature extraction. In
step S1202, the feature extraction program creates a
hierarchical form profile for the completed form using
the same feature extraction techniques used to create
form dictionary 20. In step S1204, a completed
form profile is compared with each of the blank form
profiles in the form dictionary. In step S1205, CPU 10
determines whether there is a match between one of
the blank form profiles and the completed form profile.

In the case that a match has not been determined,
in step S1206 the data entry operator is alerted
that a completed form cannot be identified and the
completed form is either manually identified by the
data entry operator or the completed form is routed
for further processing, such as optical character recognition.
In step S1207, the completed form is either
identified by the operator or by CPU 10. In step
S1208, if the completed form profile is matched with
a blank form profile in the form dictionary, the matched
blank form profile is compared with the completed
form profile. In step S1209, dissimilar elements from
the completed form profile are extracted with respective
field header information. The extracted information
and field header information are stored in memory.

As described above with respect to Figure 10a,
during the preproduction definition of the blank form
profile, it may be desirable to predesignate certain
fields from which data can be extracted. In this case,
the stored data and header information from the completed
document profile are compared with the predesignated
listing of data fields. Any data fields designated
by the predesignated listing are displayed
with respective field header information to an operator
for manual input. Otherwise, if no predesignated
listing exists, all extracted data with respective field
header information is displayed to the data entry operator
for manual input.

A third embodiment of the present invention is described
with reference to Figures 13-15. The invention
to be described hereinbelow utilizes form dictionary
20 which has been described above.

In the third embodiment of the present invention,
it is possible to display an entire completed form without
having to store all the image data from the scanned
completed form. For example, as shown in Figures
13 and 14, completed form 40 is input into the
data entry and retrieval system. Using the same feature
extraction techniques used for creating form dictionary
20, image data of completed form 40 is processed
into a hierarchical form profile consisting of
vectorized data and attribute data by hierarchical profile
creator 60. The hierarchical profile of completed
form 40 is compared with hierarchical profiles of
forms stored in form dictionary 20 by hierarchical form
comparator 61. Hierarchical form comparator 61 identifies
the completed form as one of the forms in form
dictionary 20. The corresponding form and the completed
form are compared, and dissimilar data with respective
field header information are extracted from the
completed form profile by extractor 62. The extracted
dissimilar data with respective header information is
stored in form memory. In addition to comparing and
identifying, hierarchical form comparator 61
sends an indicia of the corresponding form from form
dictionary 20 to blank form identifier 64. Blank form
identifier 64 stores the form indicia with the extracted
data in form memory.
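
A minimal sketch of this identify, extract and store sequence is given below. It rests on several assumptions that the specification does not fix: a profile is modelled as a flat set of elements rather than a hierarchy, elements match only when their coordinates are exactly equal, and a form is accepted when at least 80 percent of the blank form's elements are found. The names Element, identify, extract_dissimilar and store_record are hypothetical.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional


class Element(NamedTuple):
    x: int
    y: int
    l: int
    w: int
    att: int


Profile = FrozenSet[Element]


def identify(completed: Profile, dictionary: Dict[str, Profile],
             threshold: float = 0.8) -> Optional[str]:
    """Return the indicia of the best-matching blank form, or None."""
    best_id, best_score = None, 0.0
    for form_id, blank in dictionary.items():
        # Fraction of the blank form's elements found in the completed form.
        score = len(blank & completed) / len(blank) if blank else 0.0
        if score > best_score:
            best_id, best_score = form_id, score
    return best_id if best_score >= threshold else None


def extract_dissimilar(completed: Profile, blank: Profile) -> List[Element]:
    """Elements present in the completed form but not in the blank form."""
    return sorted(completed - blank)


def store_record(form_memory: dict, record_id: str, form_id: str,
                 extracted: List[Element]) -> None:
    """Store only the extracted data together with the form indicia."""
    form_memory[record_id] = {"form_id": form_id, "data": extracted}


blank_a = frozenset({Element(30, 37, 22, 270, 1), Element(150, 69, 10, 125, 1)})
blank_b = frozenset({Element(10, 10, 5, 50, 1), Element(20, 90, 5, 60, 1)})
dictionary = {"A": blank_a, "B": blank_b}
completed = blank_a | {Element(300, 69, 10, 80, 1)}

form_memory: dict = {}
form_id = identify(completed, dictionary)
if form_id is not None:
    store_record(form_memory, "doc-1", form_id,
                 extract_dissimilar(completed, dictionary[form_id]))
print(form_memory)
```

Because only the dissimilar elements and the indicia are kept, the stored record stays small regardless of how complex the blank form is. A practical comparator would need some positional tolerance, which this sketch omits.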

In the case that a data entry operator wishes to
retrieve completed form 40 from memory, CPU 10 retrieves
the corresponding form from form dictionary
20. CPU 10 draws the blank form in accordance with
the stored vectorized data. Once the blank form has
been drawn, the extracted data is superimposed
into the appropriate fields within the blank form. As shown
at 130 in Figure 14, data superimposed into the blank
form may be skewed left, right, up or down as a result
of scanning, or of magnification variations caused by
copying of blank forms. To correct skewing, a skewing
correction program can be applied to the data before
displaying the completed form.
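
The reassembly step can be illustrated with the sketch below. It is a simplification under stated assumptions: the "skew" is treated as a pure left/right/up/down shift estimated from one reference point, magnification and rotation are ignored, and the output is a list of abstract drawing operations rather than calls to any real graphics library. The names estimate_offset and reassemble, and the coordinates used, are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Line = Tuple[Point, Point]        # one vectorized line of the blank form
DataItem = Tuple[int, int, str]   # x, y and the extracted text


def estimate_offset(blank_anchor: Point, scanned_anchor: Point) -> Point:
    """Shift (dx, dy) that maps scanned coordinates onto the blank form."""
    return (blank_anchor[0] - scanned_anchor[0],
            blank_anchor[1] - scanned_anchor[1])


def reassemble(blank_lines: List[Line], extracted: List[DataItem],
               offset: Point = (0, 0)) -> List[tuple]:
    """Draw the blank form first, then superimpose the shifted data."""
    dx, dy = offset
    ops: List[tuple] = [("line", start, end) for start, end in blank_lines]
    ops += [("text", (x + dx, y + dy), text) for x, y, text in extracted]
    return ops


# Made-up coordinates: the scan is shifted 3 units right and 3 units down.
blank_lines = [((25, 22), (447, 22)), ((25, 22), (25, 643))]
extracted = [(153, 72, "J. Smith")]
offset = estimate_offset(blank_anchor=(25, 22), scanned_anchor=(28, 25))
ops = reassemble(blank_lines, extracted, offset)
print(ops)
```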

Referring to Figure 15, there is illustrated a flow
diagram describing a method for identifying a completed
form, extracting data from the completed form,
storing the extracted data from the completed form
with an indicia of a corresponding blank form in the
form dictionary, and reassembling the completed
form by retrieving the corresponding blank form from
memory and superimposing the extracted data from
the completed form into the appropriate fields within
the blank form.

In more detail, in step S1501, a completed form
is scanned and the image data is output to workstation
3 for processing. In step S1502, the feature extraction
program in workstation 3 creates a hierarchical
profile of the completed form. In step S1503, the
completed form profile is compared with each blank
form profile in the form dictionary. CPU 10 determines
whether a match between the completed form profile
and a blank form profile in the form dictionary has
been identified in step S1505.

In the case that CPU 10 determines that the completed
form does not match a blank form in form dictionary
20, then in step S1506 the data entry operator
is alerted that no match has been identified for the
completed form and the unidentified form is rejected
for manual input. In step S1507, the completed form is
identified either by the operator or by CPU 10.

In step S1508, the completed form profile and the
matched blank form profile are compared and dissimilar
elements from the completed form profile are extracted.
In step S1509, the extracted data from the completed form
profile is stored with the field header information from
which the data was extracted. In addition
to storing the extracted data and respective field
header information, an indicia of the corresponding
blank form in the form dictionary is stored.

In step S1510, the completed form is reassembled
by retrieving from memory vectorized data of the
corresponding blank form in accordance with the stored
indicia. The blank form is drawn line by line in accordance
with the vectorized information. After the
blank form is completely drawn, extracted data is superimposed
with respective field header information
into appropriate locations in the blank form. In step
S1511, the reassembled completed form is corrected
for skew and the result is displayed to the operator.
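
The control flow of steps S1503 to S1509, including the fallback to the operator when no match is found, might be sketched as follows. The matcher and the operator prompt are passed in as functions because the specification does not fix either one; process_completed_form, exact_subset_match and the sample data are hypothetical, and a profile is again modelled as a flat set of (x, y, l, w, att) tuples.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Element = Tuple[int, int, int, int, int]
Profile = FrozenSet[Element]


def process_completed_form(
    profile: Profile,
    dictionary: Dict[str, Profile],
    match: Callable[[Profile, Dict[str, Profile]], Optional[str]],
    ask_operator: Callable[[Profile], str],
) -> dict:
    form_id = match(profile, dictionary)          # S1503, S1505
    if form_id is None:
        form_id = ask_operator(profile)           # S1506, S1507
    blank = dictionary[form_id]
    extracted = sorted(profile - blank)           # S1508
    return {"form_id": form_id, "data": extracted}  # S1509


def exact_subset_match(profile: Profile,
                       dictionary: Dict[str, Profile]) -> Optional[str]:
    """Toy matcher: the first blank form wholly contained in the profile."""
    for form_id, blank in dictionary.items():
        if blank <= profile:
            return form_id
    return None


dictionary = {"invoice": frozenset({(30, 37, 22, 270, 1)})}
filled = frozenset({(30, 37, 22, 270, 1), (150, 69, 10, 125, 1)})
record = process_completed_form(filled, dictionary, exact_subset_match,
                                ask_operator=lambda p: "invoice")
print(record)
```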



Claims 

1. A method for creating a form dictionary, the method
comprising the steps of:

scanning a first blank form;

creating a hierarchical profile of the first
blank form;

scanning a plurality of exemplary completed
forms of the first blank form;

creating a hierarchical profile for each exemplary
completed form;

comparing and extracting invariant elements
between each completed form hierarchical
profile;

correlating the blank form hierarchical profile
with extracted invariant elements to create a
first enhanced blank form hierarchical profile;
and

storing the first enhanced blank form hierarchical
profile in the form dictionary.

2. A method according to Claim 1, further comprising
the steps of:

scanning a second blank form;

creating a hierarchical profile of the second
blank form;

scanning a plurality of exemplary completed
forms of the second blank form;

creating a hierarchical profile of each exemplary
completed form;

comparing and extracting invariant elements
between each completed form hierarchical
profile;

correlating the second blank form hierarchical
profile with the extracted invariant elements
to create a second enhanced blank form
hierarchical profile; and

storing the second enhanced blank form
hierarchical profile in the form dictionary.

3. A method according to Claim 2,
further comprising the step of disambiguating the
first enhanced blank hierarchical profile and the
second enhanced blank hierarchical profile.

4. A method according to Claim 3, wherein the step
of disambiguating comprises the steps of:

comparing and extracting invariant elements
between the first enhanced hierarchical
profile and the second enhanced hierarchical
profile;

discarding the invariant elements; and

storing disambiguated hierarchical profiles of the
first and second blank forms in the form dictionary.

5. A method for identifying a completed form using
a form dictionary of hierarchical form profiles, the
method comprising the steps of:

scanning a completed form;

creating a hierarchical profile of the completed
form;

comparing the hierarchical profile of the
completed form with hierarchical form profiles in
the form dictionary; and

identifying, in accordance with the comparing
step, one of the hierarchical form profiles
as corresponding to the completed form.

6. A method according to Claim 5, further comprising
the step of extracting portions of the hierarchical
profile of the completed form which differ
from the identified hierarchical form profile.

7. A method according to Claim 6, further comprising
the step of routing the extracted portions for
processing in accordance with the identity of the
corresponding hierarchical form profile.

8. A method according to Claim 5, 6 or 7, further
comprising the step of storing predetermined
data from the extracted portions in accordance
with the identity of the corresponding hierarchical
form profile.

9. A method according to Claim 5, 6, 7 or 8, further
comprising the step of displaying selected data
from the extracted portions in accordance with
the identity of the corresponding hierarchical
form profile.

10. A method according to Claim 5, 6, 7, 8 or 9, further
comprising the step of optical character recognition
of selected data from the extracted portions
in accordance with the identity of the corresponding
hierarchical form profile.

11. A method for displaying completed portions of a
form, the method comprising the steps of:

scanning a completed form;

creating a hierarchical profile of the completed
form;

comparing the hierarchical profile of the
completed form with a stored hierarchical profile
of a blank form; and

displaying dissimilar portions of the completed
hierarchical profile.

12. A method according to Claim 11, further comprising
the step of extracting portions of the hierarchical
profile of the completed form which differ
from the hierarchical profile of the blank form.

13. A method according to Claim 12, wherein the step
of extracting includes the step of extracting respective
field header information for the extracted
portions of the hierarchical profile of the completed
form and displaying extracted portions
with the respective header information.

14. A method according to Claim 11, 12 or 13, further
comprising a step of identifying the scanned
completed form by comparing the completed
form with blank forms stored in a form dictionary.

15. A method according to Claim 11, 12, 13 or 14, further
comprising the steps of:

creating a hierarchical profile of a blank
form;

scanning a plurality of corresponding exemplary
completed forms;

creating a hierarchical profile for each exemplary
completed form;

comparing and extracting invariant elements
from completed form hierarchical profiles;

correlating the blank form profile and extracted
invariant elements from the completed
form profiles to create an enhanced blank form
hierarchical profile; and

storing the enhanced blank form hierarchical
profile in the form dictionary.

16. A method according to any of Claims 11 to 15,
wherein the hierarchical profile of the completed
form and the blank form each include vectorized
data and attribute data.

17. A method for routing completed forms, the method
comprising the steps of:

scanning a plurality of different blank
forms;

creating and storing hierarchical profiles
of each scanned blank form;

scanning a completed form;

creating a hierarchical profile of the completed
form;

comparing the hierarchical profile of the
completed form with stored hierarchical profiles;

identifying one of the stored hierarchical
profiles as corresponding to the completed form
hierarchical profile in accordance with the result
of comparison; and

routing the completed form for processing
in accordance with the identity of the corresponding
stored hierarchical profile.

18. A method according to Claim 17, wherein the step
of creating a hierarchical profile of a blank form
includes:

creating hierarchical profiles of blank
forms;

scanning a plurality of corresponding exemplary
completed forms for each blank form;

creating a hierarchical profile for each exemplary
completed form;

comparing and extracting invariant elements
from completed form hierarchical profiles;
and

correlating each blank form profile with extracted
invariant elements of corresponding completed
form profiles to create enhanced blank
form hierarchical profiles.

19. A method according to Claim 17 or 18, wherein
the hierarchical profile of the completed form and
the blank form each include vectorized data and
attribute data.

20. A method for storing and retrieving completed
portions of a form, the method comprising the
steps of:

scanning a plurality of different blank
forms;

creating and storing hierarchical profiles
of each scanned blank form;

scanning a completed form;

creating a hierarchical profile of the completed
form;

comparing the hierarchical profile of the
completed form with stored hierarchical profiles;

identifying a stored hierarchical profile in
accordance with the result of comparison as corresponding
to the completed form hierarchical
profile;

extracting completed portions of the completed
form with respective header information, the
extracted completed portions differing from the
identified stored hierarchical profile; and

displaying the extracted completed portions
and respective header information.

21. A method according to Claim 20, wherein the step
of creating a hierarchical profile of a blank form
includes:

creating hierarchical profiles of a blank
form;

scanning a plurality of corresponding exemplary
completed forms for each blank form;

creating a hierarchical profile for each exemplary
completed form;

comparing and extracting invariant elements
from completed form hierarchical profiles;
and

correlating the blank form profile with extracted
invariant elements of corresponding completed
form profiles to create enhanced blank
form hierarchical profiles.

22. A method according to Claim 20 or 21, wherein
the hierarchical profile of the completed form and
the blank form each include vectorized data and
attribute data.

23. A method according to Claim 20, 21 or 22, wherein
the storing step includes storing an indicia of
the identified stored hierarchical profile.

24. A method according to Claim 20, 21, 22 or 23, further
comprising the step of displaying the completed
form by displaying the corresponding
blank form which has been retrieved in accordance
with the stored indicia and superimposing
the extracted completed portions from the completed
form with respective header information
into appropriate locations of the blank form.

25. An apparatus for identifying a completed form using
a form dictionary of hierarchical form profiles,
the apparatus comprising:

scanning means for scanning a completed
form;

profile creating means for creating a hierarchical
profile of the completed form;

comparison means for comparing the hierarchical
profile of the completed form with hierarchical
form profiles in the form dictionary;

identifying means for identifying, in accordance
with the result of comparing, one of the hierarchical
form profiles as corresponding to the
completed form; and

extracting means for extracting portions of
the hierarchical profile of the completed form
which differ from the identified hierarchical form
profile.

26. An apparatus according to Claim 25, further comprising
routing means for routing the extracted
portions for processing in accordance with the
identity of the corresponding hierarchical form
profile.

27. An apparatus according to Claim 25 or 26, further
comprising storing means for storing predetermined
data from the extracted portions in accordance
with the identity of the corresponding hierarchical
form profile.

28. An apparatus according to Claim 25, 26 or 27,
further comprising displaying means for displaying
selected data from the extracted portions in
accordance with the identity of the corresponding
hierarchical form profile.

29. An apparatus according to Claim 25, 26, 27 or 28,
further comprising character recognition means
for recognizing selected data from the extracted
portions in accordance with the identity of the
corresponding hierarchical form profile.

30. An apparatus for displaying completed portions
of a form, the apparatus comprising:

scanning means for scanning a completed
form;

profile creating means for creating a hierarchical
profile of the completed form;

comparison means for comparing the hierarchical
profile of the completed form with a hierarchical
profile of a blank form stored in a form
directory;

extracting means for extracting portions of
the hierarchical profile of the completed form
which differ from the hierarchical profile of the
blank form; and

displaying means for displaying the extracted
portions of the completed hierarchical
profile.

31. An apparatus according to Claim 30, further comprising
identifying means for identifying the
scanned completed form by comparing the completed
form to blank forms stored in the form dictionary.

32. An apparatus according to Claim 30 or 31, wherein
the extracting means includes means for extracting
respective field header information with
the extracted portions of the hierarchical profile
of the completed form and the displaying means
displays the extracted portions with the respective
header information.


33. An apparatus according to Claim 30, 31 or 32,
further comprising:

scanning means for scanning a plurality of
corresponding exemplary completed forms;

form profile creating means for creating a
hierarchical profile of a blank form and for creating
a hierarchical profile for each exemplary completed
form;

comparing and extracting means for comparing
and extracting invariant elements from
completed form hierarchical profiles; and

correlating means for correlating the blank
form profile and extracted invariant elements
from the completed form profiles to create an enhanced
blank form hierarchical profile.

34. An apparatus according to Claim 30, 31, 32 or 33,
wherein the hierarchical profile of the completed
form and the blank form each include vectorized
data and attribute data.

35. An apparatus for identifying completed forms, 
the apparatus comprising: 

memory means for storing hierarchical 
profiles for a plurality of different blank forms; 
scanning means for scanning a completed 

form; 

creating means for creating a hierarchical 
profile of the completed form; 

comparison means for comparing the hier- 
archical profile of the completed form with stored 
hierarchical profiles; 

identifying means for identifying one of 
the stored hierarchical profiles as corresponding 
to the completed form hierarchical profile in ac- 
cordance with the result of comparison; and 

routing means for routing the completed 
form for processing in accordance with the iden- 
tity of the corresponding stored hierarchical pro- 
file. 

36. A completed form identifier, comprising: 

a scanner for scanning a completed form; 

a hierarchical profile creator for creating a 
hierarchical profile of the scanned completed 
form; 

a comparator for comparing the hierarchical
profile of the scanned form with hierarchical
profiles of various blank forms stored in a form
dictionary;

a form identifier for identifying the completed
form based on the comparison result of the
comparator; and

a form router for routing the form for further
processing based on its identity.
37. An apparatus for displaying extracted data from
a completed form, comprising:

a scanner for scanning a completed form;

a hierarchical profile creator for creating a
hierarchical profile of the scanned form;

a comparator for comparing the hierarchical
profile of the completed form to hierarchical profiles
of various blank forms stored in a form dictionary;

a form identifier for identifying the completed
form and for extracting data from the completed
form based on its identity; and

a display screen for displaying the extracted
data.

38. An apparatus for displaying a completed form,
comprising:

a scanner for scanning a completed form;

a hierarchical profile creator for creating a
hierarchical profile of vectorized data corresponding
to the scanned form;

a comparator for comparing the hierarchical
profile of the completed form to hierarchical
profiles of vectorized data corresponding to blank
forms in a form dictionary and for locating a blank
form hierarchical profile matching the completed
form profile;

a data extractor for extracting data identifiers
and data which is dissimilar to the matching
blank form hierarchical structure from the hierarchical
profile of the completed form;

a memory for storing the extracted data,
data identifiers, and an indicia of the matching
blank form; and

a screen display for displaying the extracted
data superimposed on the matching blank
form, wherein the matching blank form is drawn
in accordance with vectorized data in the hierarchical
profile, and wherein the extracted data is
superimposed into appropriate locations in the
blank form in accordance with the extracted data
identifiers.

39. A form identifier for identifying a completed form
using a form dictionary of hierarchical form profiles,
the apparatus comprising:

a processing unit including a computer for
executing stored program steps;

a memory for storing the form dictionary of
hierarchical form profiles and process steps for
execution by the processing unit; and

a scanner for scanning completed forms,

wherein the process steps stored in the
memory include steps to create a hierarchical profile
of a scanned completed form, to compare the
hierarchical profile of the completed form to hierarchical
form profiles in the memory, to match the
hierarchical form profile of the completed form to
one of the hierarchical form profiles in the memory,
and to form-process the completed form
based on an identity of the matched hierarchical
form profile.

40. A form identifier according to Claim 39, wherein
the process step of form-processing includes
routing the completed form to an appropriate key-input
operator.

41. A form identifier according to Claim 39, wherein 
the process step of form-processing includes ex- 
tracting invariant elements from the completed 
form hierarchical profile based on a comparison 
to the matched hierarchical form profile and dis- 
playing the extracted invariant elements to a key- 
input operator. 

42. An apparatus for automated scanning and dis- 
play of documents, wherein areas of difference 
relative to a pre-defined template document are 
preferentially displayed. 

43. An apparatus for the storage and retrieval of 
document images, wherein a document to be 
stored is compared with a template image, and 
areas of difference only are stored for retrieval. 

44. An apparatus according to claim 43, wherein for
image retrieval, stored difference information is
combined with template information to synthesize
a retrieval document image.

45. An apparatus according to any of claims 42, 43
or 44, further comprising means for automatically
identifying one template document by comparing
an input document with a plurality of candidate
templates.

46. An apparatus for storing document templates, 
wherein stored template definitions are en- 
hanced so as to emphasize features which are in- 
variant between a plurality of sample documents 
having the relevant template. 

47. An apparatus for storing a plurality of document 
templates for recognition, wherein the templates 
are enhanced so as to de-emphasize features 
common to different templates. 
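
As an illustration of the dictionary-building steps recited in Claims 1 to 4, the sketch below uses plain set operations. This is one possible reading only: "correlating" is taken to mean keeping the blank form elements that are also invariant across the exemplary completed forms, and profiles are modelled as flat sets of (x, y, l, w, att) tuples. The function names and sample data are hypothetical.

```python
from functools import reduce
from typing import FrozenSet, Iterable, Tuple

Element = Tuple[int, int, int, int, int]
Profile = FrozenSet[Element]


def invariant_elements(completed_profiles: Iterable[Profile]) -> Profile:
    """Elements that appear in every exemplary completed form."""
    profiles = list(completed_profiles)
    if not profiles:
        return frozenset()
    return reduce(lambda a, b: a & b, profiles)


def enhance(blank: Profile, completed_profiles: Iterable[Profile]) -> Profile:
    """Assumed reading of 'correlating': keep blank elements that are invariant."""
    profiles = list(completed_profiles)
    if not profiles:
        return blank  # nothing to correlate against
    return blank & invariant_elements(profiles)


def disambiguate(first: Profile, second: Profile) -> Tuple[Profile, Profile]:
    """Discard the elements shared by two enhanced profiles (Claim 4)."""
    shared = first & second
    return first - shared, second - shared


logo = (10, 10, 5, 50, 1)
title_a = (30, 37, 22, 270, 1)
title_b = (30, 37, 22, 200, 1)
smudge = (5, 5, 1, 1, 1)  # noise present only in the blank scan

enhanced_a = enhance(frozenset({logo, title_a, smudge}), [
    frozenset({logo, title_a, (150, 69, 10, 125, 1)}),
    frozenset({logo, title_a, (150, 69, 10, 90, 1)}),
])
enhanced_b = enhance(frozenset({logo, title_b}),
                     [frozenset({logo, title_b, (1, 2, 3, 4, 1)})])
unique_a, unique_b = disambiguate(enhanced_a, enhanced_b)
print(unique_a, unique_b)
```

In this toy data the shared logo is discarded, so each stored profile keeps only the element that distinguishes its form from the other.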
1 nontext index = 1 x=25 y=22 l=621 w=422 att = 40
1.1 index = 1 x=30 y=37 l=22 w=270 att = 1
    x=30 y=37 l=22 w=270 att = 1
1.2 index = 2 x=311 y=39 l=15 w=94 att = 1
    x=311 y=39 l=15 w=94 att = 1
1.3 index = 3 x=150 y=69 l=10 w=125 att = 1
    x=150 y=69 l=10 w=125 att = 1
1.4 index = 4 x=378 y=69 l=10 w=40 att = 1
    x=378 y=69 l=10 w=40 att = 1
1.5 index = 5 x=284 y=255 l=10 w=134 att = 1
    x=284 y=255 l=10 w=134 att = 1
1.6 index = 9 x=350 y=557 l=10 w=57 att = 1
    x=350 y=557 l=10 w=57 att = 1
1.7 index = 10 x=351 y=583 l=10 w=52 att = 1
    x=351 y=583 l=10 w=52 att = 1
1.8 index = 11 x=352 y=610 l=10 w=30 att = 1
    x=352 y=610 l=10 w=30 att = 1
1.9 nontext index = 1 x=424 y=37 l=47 w=92 att = 40
1.9.1 nontext index = 0 x=436 y=44 l=10 w=56 att = 1000
1.9.1.1 nontext index = 1 x=436 y=44 l=10 w=56 att = 81
1.9.2 nontext index = 0 x=436 y=66 l=10 w=44 att = 1000
1.9.2.1 index = 2 x=436 y=66 l=10 w=44 att = 1
    x=436 y=66 l=10 w=44 att = 1
1.10 nontext index = 2 x=278 y=59 l=25 w=1 att = 40
1.10.1 nontext index = 1 x=284 y=67 l=10 w=23 att = 4003
1.11 nontext index = 3 x=41 y=86 l=154 w=475 att = 2
1.11.1 index = 20 x=47 y=99 l=84 w=201 att = 1
    x=47 y=99 l=13 w=201 att = 1
    x=47 y=117 l=14 w=117 att = 1
    x=47 y=136 l=9 w=115 att = 1
    x=47 y=153 l=11 w=131 att = 1
    x=47 y=171 l=12 w=104 att = 1
1.11.2 index = 23 x=168 y=201 l=30 w=101 att = 1
    x=169 y=201 l=13 w=100 att = 1
    x=168 y=218 l=13 w=53 att = 1
1.11.3 index = 24 x=47 y=202 l=27 w=99 att = 1
    x=47 y=202 l=10 w=99 att = 1
    x=50 y=219 l=10 w=45 att = 1
1.11.4 index = 25 x=281 y=99 l=9 w=19 att = 1
    x=281 y=99 l=9 w=19 att = 1
1.11.5 index = 27 x=308 y=101 l=47 w=107 att = 1
    x=308 y=101 l=47 w=95 att = 1
    x=308 y=118 l=13 w=107 att = 1
    x=303 y=137 l=11 w=98 att = 1
1.11.6 index = 28 x=282 y=170 l=11 w=116 att = 1
    x=282 y=170 l=11 w=116 att = 1
1.11.7 index = 29 x=303 y=202 l=10 w=98 att = 1
    x=303 y=202 l=10 w=98 att = 1
1.11.8 index = 30 x=303 y=221 l=10 w=55 att = 1
    x=303 y=221 l=10 w=55 att = 1
1.11.9 index = 31 x=454 y=203 l=9 w=27 att = 1
    x=454 y=203 l=9 w=27 att = 1
1.11.10 index = 32 x=433 y=222 l=9 w=39 att = 1
    x=433 y=222 l=9 w=39 att = 1
1.12 nontext index = 4 x=424 y=245 l=24 w=92 att = 40
1.12.1 nontext index = 1 x=462 y=253 l=11 w=46 att = 81
1 x=51 y=288 l=12 w=19 att = 1
4 x=365 y=281 l=22 w=30 att = 1
    x=365 y=281 l=22 w=30 att = 1
5 x=434 y=281 l=21 w=52 att = 1
    x=434 y=281 l=21 w=52 att = 1
index = 6 x=51 y=315 l=10 w=6 att = 1
    x=51 y=315 l=10 w=6 att = 1
    x=90 y=313 l=11 w=43 att = 1
8 x=173 y=314 l=12 w=110 att = 1
1.13.9 index = 9 x=376 y=315 l=10 w=35 att = 1
    x=376 y=315 l=10 w=35 att = 1
1.13.10 index = 10 x=454 y=316 l=12 w=45 att = 1
    x=454 y=316 l=12 w=45 att = 1
1.13.11 index = 11 x=49 y=333 l=23 w=11 att = 1
    x=49 y=333 l=23 w=11 att = 1
1.13.12 index = 12 x=89 y=333 l=23 w=43 att = 1
    x=89 y=333 l=23 w=43 att = 1
1.13.13 index = 13 x=172 y=333 l=24 w=84 att = 1
    x=172 y=333 l=24 w=84 att = 1
1.13.14 index = 14 x=368 y=334 l=23 w=44 att = 1
    x=368 y=334 l=23 w=44 att = 1
1.13.15 index = 15 x=454 y=348 l=9 w=45 att = 1
    x=454 y=348 l=9 w=45 att = 1
1.13.16 index = 16 x=420 y=549 l=73 w=90 att = 1
    x=420 y=549 l=73 w=90 att = 1
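
Entries of the kind listed above could be read into a simple structure as sketched below. The sketch assumes each entry has the shape "path [nontext] index = n x=.. y=.. l=.. w=.. att = ..", that a dotted prefix such as 1.9.1 gives the position in the hierarchy, and that a line without a prefix is a sub-entry of the preceding one. The meaning of each field is not defined in this listing, so the parser only records the numbers; parse_entry is a hypothetical name.

```python
import re
from typing import Dict, Optional, Tuple, Union

PAIR = re.compile(r"\b(index|x|y|l|w|att)\s*=\s*(\d+)")
PATH = re.compile(r"^(\d+(?:\.\d+)*)\s")

Value = Union[int, bool, Optional[Tuple[int, ...]]]


def parse_entry(line: str) -> Dict[str, Value]:
    """Parse one listing line into its path, nontext flag and numeric fields."""
    line = line.strip()
    m = PATH.match(line)
    entry: Dict[str, Value] = {
        # Dotted prefix, e.g. "1.9.1" -> (1, 9, 1); None for sub-entries.
        "path": tuple(int(p) for p in m.group(1).split(".")) if m else None,
        "nontext": "nontext" in line,
    }
    # Collect every key=value pair, tolerating spaces around "=".
    entry.update({key: int(value) for key, value in PAIR.findall(line)})
    return entry


print(parse_entry("1.9.1 nontext index = 0 x=436 y=44 l=10 w=56 att = 1000"))
print(parse_entry("    x=436 y=66 l=10 w=44 att = 1"))
```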